Annealing and replica-symmetry in deep Boltzmann machines / Alberici, Diego; Barra, Adriano; Contucci, Pierluigi; Mingione, Emanuele. - In: JOURNAL OF STATISTICAL PHYSICS. - ISSN 0022-4715. - 180:(2020), pp. 665-677. [10.1007/s10955-020-02495-2]

Annealing and replica-symmetry in deep Boltzmann machines

Adriano Barra
2020

Abstract

In this paper we study the properties of the quenched pressure of a multi-layer spin-glass model (a deep Boltzmann machine in artificial intelligence jargon) whose pairwise interactions are allowed between spins lying in adjacent layers, but neither inside the same layer nor among layers at distance larger than one. We prove a theorem that bounds the quenched pressure of such a K-layer machine in terms of K Sherrington–Kirkpatrick spin glasses and use it to investigate its annealed region. The replica-symmetric approximation of the quenched pressure is identified and its relation to the annealed one is considered. The paper also presents some observations on the model’s architectural structure related to machine learning. Since escaping the annealed region is mandatory for meaningful training, by squeezing this region we obtain thermodynamical constraints on the form factors. Remarkably, the optimal escape is achieved by requiring the last layer to scale sub-linearly in the network size.
Spin glasses; Boltzmann machines; Machine learning; Thermodynamical constraints
01 Journal publication::01a Journal article
Files attached to this item

File: Alberici_Annealing_2020.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 312.96 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1707785
Citations
  • PMC: not available
  • Scopus: 21
  • Web of Science (ISI): 18